The private capacity of quantum channels is not additive 
Recently there has been considerable activity on the subject of additivity of
various quantum channel capacities. Here, we construct a family of channels with
sharply bounded classical, and hence private, capacity. On the other hand, their
quantum capacity, when combined with a zero private (and zero quantum) capacity
erasure channel, becomes larger than the previous classical capacity. As a
consequence, we can conclude for the first time that the classical private
capacity is non-additive. In fact, in our construction even the quantum capacity
of the tensor product of two channels can be greater than the sum of their
individual classical private capacities. We show that this violation occurs quite
generically: every channel can be embedded into our construction, and a violation
occurs whenever the given channel has larger entanglement-assisted quantum
capacity than (unassisted) classical capacity.



Information Theory, established by Claude Shannon in the 1940s as a
"Mathematical Theory of Communication", is the theoretical foundation of today's
communication technologies. The main problem it is concerned with is how much
information can be transmitted down a noisy channel asymptotically, i.e. the
capacity of the channel. Shannon provided a beautifully simple formula for the
capacity of a discrete memoryless channel, which only involves an entropic
expression of a single channel use. Subsequent research revealed that this simple
capacity formula fully characterizes the information-carrying capability of a
channel under a large range of circumstances [2], serving as a very robust
measure. E.g., the ability of two channels together to transmit information is
quantified by the sum of their individual capacities.

Our world, however, is not the classical one of Shannon's noisy channels, but is
at a basic level described by quantum theory. To understand the ultimate limit
the laws of physics impose on our ability to communicate, the underlying quantum
behavior of the channels should be considered. A quantum channel $\mathcal{N}$ is
mathematically described by an isometric map $V$ from the input Hilbert space $A$
to the combined Hilbert space of the output $B$ and the so-called environment
system $E$. Then the channel and its natural complement $\widehat{\mathcal{N}}$
act as

$$ \mathcal{N}(\rho) = \mathrm{Tr}_E\, V\rho V^\dagger, \qquad \widehat{\mathcal{N}}(\rho) = \mathrm{Tr}_B\, V\rho V^\dagger. $$

A quantum channel can in general convey not only classical messages, but also
quantum data, i.e. a Hilbert space of quantum states. It can also carry classical
private information, inaccessible to the environment, enabling the classically
impossible, provably unconditionally secure key distribution [?]. Naturally,
deriving capacity formulae of a quantum channel for transmitting various kinds of
information is a central task of quantum information theory.

The classical capacity, $C(\mathcal{N})$, is the maximal rate of classical
information that the quantum channel $\mathcal{N}$ can asymptotically transmit
with vanishing errors. In contrast to the classical capacity, the definition of
classical private capacity $P(\mathcal{N})$ further requires that the classical
information conveyed is secret from the environment. Finally, the quantum
capacity $Q(\mathcal{N})$ quantifies how large a Hilbert space of states the
channel $\mathcal{N}$ can transmit asymptotically and with the error approaching
zero. Operationally, quantum information transmission implies private classical
transmission, which in turn implies plain classical communication. I.e.,

$$ C(\mathcal{N}) \ge P(\mathcal{N}) \ge Q(\mathcal{N}). \tag{1} $$



Despite considerable progress, tractable formulae for the quantum, private and
unrestricted classical capacities are still out of reach. The HSW theorem [?],
Devetak [?] and the LSD theorem [?, ?] give the classical, private and quantum
capacities, respectively, as the regularisation of single-letter quantities:

$$ \chi(\mathcal{N}) \le C(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n}\,\chi(\mathcal{N}^{\otimes n}), \tag{2} $$
$$ P^{(1)}(\mathcal{N}) \le P(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n}\,P^{(1)}(\mathcal{N}^{\otimes n}), \tag{3} $$
$$ Q^{(1)}(\mathcal{N}) \le Q(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n}\,Q^{(1)}(\mathcal{N}^{\otimes n}). \tag{4} $$



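All three single-letter quantities are obtained via finite optimizations of
entropic expressions: the Holevo capacity $\chi(\mathcal{N})$ is the maximum over
all ensembles $\{p_i, \rho_i\}$ of states on $A$ of the Holevo information

$$ \chi_{\{p_i,\rho_i\}}(\mathcal{N}) = H\Big(\mathcal{N}\Big(\sum_i p_i\rho_i\Big)\Big) - \sum_i p_i H(\mathcal{N}(\rho_i)), \tag{5} $$

where $H(\rho) = -\mathrm{Tr}\,\rho\log\rho$ is the von Neumann entropy (log is
always the binary logarithm). Similarly,
$P^{(1)}(\mathcal{N}) = \max_{\{p_i,\rho_i\}} \big(\chi_{\{p_i,\rho_i\}}(\mathcal{N}) - \chi_{\{p_i,\rho_i\}}(\widehat{\mathcal{N}})\big)$,
and $Q^{(1)}(\mathcal{N}) = \max_\rho I_c(\rho,\mathcal{N})$, with the coherent
information [?]

$$ I_c(\rho,\mathcal{N}) = H(\mathcal{N}(\rho)) - H(\widehat{\mathcal{N}}(\rho)). \tag{6} $$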



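Neither $\chi(\mathcal{N})$, nor $Q^{(1)}(\mathcal{N})$, nor
$P^{(1)}(\mathcal{N})$ is additive; in fact, $\chi \ne C$ [?], $P^{(1)} \ne P$
[?], $Q^{(1)} \ne Q$ [?]. However, for certain classes of channels it is known
that $C(\mathcal{N}) = \chi(\mathcal{N})$ [?], and for other classes
$P(\mathcal{N}) = P^{(1)}(\mathcal{N}) = Q(\mathcal{N}) = Q^{(1)}(\mathcal{N})$ [13].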

As measures of a channel's information transmitting capability, the above three
capacity quantities might be expected to be robust, i.e. just like Shannon's
capacity for classical channels, to be applicable under a large range of
settings. While this is no longer true when various auxiliary resources (e.g.
free entanglement or classical communications) are available [3], another weird
feature of the quantum capacity was discovered recently. Smith and Yard [13] show
that, as a function of channels, $Q(\mathcal{N})$ is not additive. Specifically,
for the two channels $\mathcal{N}_1$ and $\mathcal{N}_2$, with $\mathcal{N}_1$
satisfying $Q(\mathcal{N}_1) = 0$ and $P(\mathcal{N}_1) > 0$, and $\mathcal{N}_2$
the (zero quantum and zero private capacity) 50% erasure channel,
$Q(\mathcal{N}_1 \otimes \mathcal{N}_2) \ge \frac{1}{2}P(\mathcal{N}_1) > 0$. One
might attribute this superactivation of quantum capacity to the ability to
transmit privacy [?], recalling the close relationship between $Q(\mathcal{N})$
and $P(\mathcal{N})$ [3]. But surprisingly again, Smith and Smolin [?] have found
two channels such that either they have large joint quantum capacity but
negligible individual private classical capacities, or one of them exhibits a
large non-additivity of $\chi$.

In this Letter, we present quantum channels $T_{\mathcal{N}}^{k}$ for a given
channel $\mathcal{N}$ with finite environment dimension (this includes all
channels with finite dimensional input and output), and integer $k$; the channel
$T_{\mathcal{N}}^{k}$ inherits input and output from $\mathcal{N}$, but also has
auxiliary registers. We can show that
$C(\mathcal{N}) \le C(T_{\mathcal{N}}^{k}) \le C(\mathcal{N}) + \delta(k)$, where
$\delta(k)$ goes to zero as $k \to \infty$. Regarding the capability of the
channel $T_{\mathcal{N}}^{k}$, together with a 50% erasure channel $\mathcal{A}$,
for quantum communication, we find that the quantum capacity of the combined
channel $T_{\mathcal{N}}^{k} \otimes \mathcal{A}$ is lower bounded by
$Q_E(\mathcal{N})$, the entanglement-assisted quantum capacity of $\mathcal{N}$
[21]. So, for channels $\mathcal{N}$ such that
$Q_E(\mathcal{N}) > C(\mathcal{N})$, $T_{\mathcal{N}}^{k}$ - when combined with
the above erasure channel - can transmit more quantum information than its
classical capacity $C(T_{\mathcal{N}}^{k})$. Referring to Eq. (1), we
conclusively prove that the classical private capacity, in fact even the quantum
capacity, of two channels can be greater than the sum of their individual
classical private capacities. Our findings not only demonstrate that the
classical private capacity of a quantum channel is generally not additive, but
also yield another counterexample to the additivity of quantum capacity, whose
underlying reasoning is different from that of Smith and Yard [17].

The channel construction. In the Stinespring representation
$\mathcal{N}(\rho) = \mathrm{Tr}_E\, V\rho V^\dagger$ the partial trace embodies
all the noise of the channel as loss of information; if Bob got $E$ as well,
there would be no noise at all, as he could undo the isometry. However, there is
a well-known way of giving him $E$ anyway, namely to completely randomize it:
denoting the discrete Weyl operators on $E$ by $W_j$ ($j = 1, \ldots, |E|^2$), if
the channel internally picks $j$ uniformly at random and applies $W_j$ to $E$, it
creates a new channel with output
$\mathcal{N}(\rho)^{B} \otimes \frac{1}{|E|}\mathbb{1}^{E}$. The extra register
is always constant, so the new channel has the exact same information properties
as $\mathcal{N}$. The idea of the following channel construction is to add
another "gadget" on top of this, which outputs some randomness approximating the
uniform distribution above - see Fig. 1; so, intuitively, on its own it does not
alter the classical capacity of the channel too much, but if paired with the
right resources it can increase the quantum capacity.
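The randomization step can be checked numerically. The following is a minimal sketch in plain Python, assuming the convention $W_{(a,b)} = X^a Z^b$ for the discrete Weyl operators (the text does not fix a convention); the helper names `weyl`, `matmul`, `dagger` and `weyl_twirl` are illustrative only. It averages $W\rho W^\dagger$ over all $d^2$ Weyl operators and should return the maximally mixed state for any input.

```python
import cmath


def weyl(a, b, d):
    """Discrete Weyl operator X^a Z^b on C^d (assumed convention), as a nested list.

    It acts as (X^a Z^b)|k> = w^(b k) |k + a mod d>, with w = exp(2 pi i / d).
    """
    w = cmath.exp(2j * cmath.pi / d)
    op = [[0j] * d for _ in range(d)]
    for k in range(d):
        op[(k + a) % d][k] = w ** (b * k)
    return op


def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x[i][l] * y[l][j] for l in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def dagger(x):
    """Conjugate transpose of a nested-list matrix."""
    return [[x[j][i].conjugate() for j in range(len(x))]
            for i in range(len(x[0]))]


def weyl_twirl(rho):
    """Average W rho W^dagger over all d^2 Weyl operators, chosen uniformly."""
    d = len(rho)
    out = [[0j] * d for _ in range(d)]
    for a in range(d):
        for b in range(d):
            w_op = weyl(a, b, d)
            term = matmul(matmul(w_op, rho), dagger(w_op))
            for i in range(d):
                for j in range(d):
                    out[i][j] += term[i][j] / d ** 2
    return out
```

Calling `weyl_twirl` on any density matrix of dimension `d` gives a matrix close to the identity divided by `d`, which is why the randomized register carries no information on its own.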



FIG. 1: The channel $T_{\mathcal{N}}^{k}$: in the lower part it contains
$\mathcal{N}$ (with input register $A_2$, output register $B_2$, and environment
$E$). It also has another input register $A_1$ of dimension $c = |E|^{2k}$, which
we view in a fixed way as a tensor product of $k$ $|E|^2$-dimensional systems
$A_{11}, \ldots, A_{1k}$, each coming with a fixed computational basis
$\{|j\rangle\}_{j=1,\ldots,|E|^2}$. This big register is subjected to a random
unitary rotation $U$, where $U$ is chosen from the Haar measure and subsequently
output (a classical description of it) in register $B_0$. All registers
$A_{12}, \ldots, A_{1k}$ are discarded, only $A_{11}$ is measured in the
computational basis, and the result $j$ is used to control a unitary
transformation (Weyl operator) $W_j$ on the environment $E$, which is then output
in the register $B_1$. A formal definition can be found in the Auxiliary Material
[22].



A comment on why we need the rather large register $A_1$, most of which is
discarded anyway. In fact, the size (parametrized by $k$) has a double purpose:
on the one hand, we need $A_{11}$ to be close to maximally mixed for most inputs.
But more importantly, it makes it very "costly", though not impossible, to use
entanglement with another system to access the index $J_1$ (see the proof of
Theorem 1).
The additivity violation. Now, if we knew that the Holevo quantity
$\chi(T_{\mathcal{N}}^{k})$ were additive for $T_{\mathcal{N}}^{k}$, we would
have $C(T_{\mathcal{N}}^{k}) = \chi(T_{\mathcal{N}}^{k})$. Since it is possible
to show that $\chi(T_{\mathcal{N}}^{k}) \le \chi(\mathcal{N}) + o(1)$ - this is a
special case of Theorem 1 below -, we would have an upper bound

$$ P(T_{\mathcal{N}}^{k}) \le C(T_{\mathcal{N}}^{k}) \le C(\mathcal{N}) + o(1). \tag{7} $$

While we are not able to show additivity for the channels $T_{\mathcal{N}}^{k}$,
the above relation is nevertheless true. In fact, we have the following general
theorem, proved in full in the Auxiliary Material [22].



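Theorem 1 For any channel $\mathcal{N}$ with input space $A$, output space $B$
and environment $E$, and any integer $k$, let
$\delta(k) = \frac{1}{k}(5 + 4\log|E|)$. Then, for an arbitrary channel
$\mathcal{E}$,

$$ \chi(\mathcal{N}\otimes\mathcal{E}) \le \chi(T_{\mathcal{N}}^{k}\otimes\mathcal{E}) \le \chi(\mathcal{N}\otimes\mathcal{E}) + \delta(k). \tag{8} $$

As a consequence,

$$ C(\mathcal{N}) \le C(T_{\mathcal{N}}^{k}) \le C(\mathcal{N}) + \delta(k). $$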
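To get a feeling for how large $k$ has to be, the slack $\delta(k)$ of Theorem 1 can simply be evaluated. The sketch below assumes that a strictly positive target gap (for instance $Q_E(\mathcal{N}) - C(\mathcal{N})$) is supplied by the caller; the function names `delta` and `smallest_k` are illustrative and not part of the original construction.

```python
import math


def delta(k, dim_env):
    """Slack delta(k) = (5 + 4 log2|E|) / k from Theorem 1 (binary logarithm)."""
    if k < 1 or dim_env < 1:
        raise ValueError("k and dim_env must be positive integers")
    return (5 + 4 * math.log2(dim_env)) / k


def smallest_k(gap, dim_env):
    """Smallest integer k with delta(k) < gap, for a given positive gap."""
    if gap <= 0:
        raise ValueError("gap must be strictly positive")
    k = math.floor((5 + 4 * math.log2(dim_env)) / gap) + 1
    # Guard against floating-point rounding at the boundary.
    while delta(k, dim_env) >= gap:
        k += 1
    return k
```

For a qubit-sized gap and a small environment, `smallest_k` returns a modest number, but the control register $A_1$ has dimension $|E|^{2k}$, so the input dimension grows exponentially in that $k$.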

On the other hand, we can get a lower bound on
$P(T_{\mathcal{N}}^{k}\otimes\mathcal{A})$, where $\mathcal{A}$ is the 50%
erasure channel of input dimension $c$; note that by the no-cloning principle,
$P(\mathcal{A}) = Q(\mathcal{A}) = 0$. Since the private classical capacity is
not smaller than the quantum capacity, which is in turn lower bounded by the
coherent information, we evaluate the coherent information of the channel
$T_{\mathcal{N}}^{k}\otimes\mathcal{A}$. Let us look at an input state as
follows: Alice prepares a maximally entangled state $\Phi^{A_1C}$ of dimension
$c \times c$ and feeds the two halves into the control input ($A_1$) and the 50%
erasure channel ($C$; its quantum output we also denote $C$, and the erasure flag
$D$). She feeds another arbitrary state $\rho^{A_2}$, whose purification is
denoted as $|\varphi\rangle^{AA_2}$, into the data input and keeps the system
$A$. I.e., we compute the coherent information with respect to the input state
$\sigma^{A_1A_2C} = \Phi^{A_1C}\otimes\rho^{A_2}$, so that the final state after
the channel action is

$$ \sigma^{AB_0B_1B_2CD} = (\mathrm{id}_A\otimes T_{\mathcal{N}}^{k}\otimes\mathcal{A})(\Phi^{A_1C}\otimes\varphi^{AA_2}). $$



The coherent information, with respect to this state, is

$$ I_c(\sigma^{A_1A_2C}, T_{\mathcal{N}}^{k}\otimes\mathcal{A}) = H(B_1B_2CD|B_0) - H(AB_1B_2CD|B_0). $$

By an argument similar to that in [20], we divide the computation into two
cases, according to whether the information sent into $\mathcal{A}$ is erased or
not erased:

$$ I_c(\sigma^{A_1A_2C}, T_{\mathcal{N}}^{k}\otimes\mathcal{A}) = \frac{1}{2}\left(I_c^{\rm erased} + I_c^{\rm not\text{-}erased}\right). $$

In the erased case, the state is decoupled between $AB_2$ and $B_0B_1C$, so the
coherent information simplifies to

$$ I_c^{\rm erased} = H(B_2) - H(AB_2) = H(\mathcal{N}(\rho)) - H\big((\mathrm{id}\otimes\mathcal{N})\varphi^{AA_2}\big). $$

When the transmitted information is not erased, Bob will be able to correct the
errors encountered by the noisy channel $\mathcal{N}$ as follows. Bob reads the
output $B_0$, learning which unitary transformation is applied by the channel
$T_{\mathcal{N}}^{k}$. Then he can measure $C$ in the proper basis to get $j$,
and then apply $W_j^\dagger$ to $B_1$, recovering the environment $E$. As a
result, Bob possesses the output and the environment of $\mathcal{N}$
simultaneously, effectively obtaining the quantum information input into
$\mathcal{N}$ completely. In this case, the system $B_0C$ is decoupled from
$AB_1B_2$, which is in the pure state $(\mathbb{1}\otimes V)|\varphi\rangle^{AA_2}$.
So,

$$ I_c^{\rm not\text{-}erased} = H(B_1B_2) - H(AB_1B_2) = H(\rho^{A_2}). $$

Adding these two cases together, we have

$$ I_c(\sigma, T_{\mathcal{N}}^{k}\otimes\mathcal{A}) = \frac{1}{2}\left[H(\rho) + H(\mathcal{N}(\rho)) - H\big((\mathrm{id}\otimes\mathcal{N})\varphi\big)\right]. $$



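The term in brackets on the right hand side is called quantum mutual information
(between input and output of $\mathcal{N}$). In [21], it is proved that the
maximum over $\rho$ of the right hand side is the entanglement-assisted quantum
capacity $Q_E(\mathcal{N})$ of the channel $\mathcal{N}$. I.e.,

$$ Q_E(\mathcal{N}) = \max_{\rho} I_c(\sigma^{A_1A_2C}, T_{\mathcal{N}}^{k}\otimes\mathcal{A}), $$

and hence

$$ P(T_{\mathcal{N}}^{k}\otimes\mathcal{A}) \ge Q(T_{\mathcal{N}}^{k}\otimes\mathcal{A}) \ge Q_E(\mathcal{N}). \tag{9} $$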



Now, comparing Eqs. (7) and (9), also making use of Eq. (1), we see that for all
channels $\mathcal{N}$ such that

$$ Q_E(\mathcal{N}) > C(\mathcal{N}), \tag{10} $$

we have, for sufficiently large $k$,

$$ P(T_{\mathcal{N}}^{k}\otimes\mathcal{A}) \ge Q(T_{\mathcal{N}}^{k}\otimes\mathcal{A}) > P(T_{\mathcal{N}}^{k}) \ge Q(T_{\mathcal{N}}^{k}). \tag{11} $$

Note that the channel $\mathcal{A}$ has zero private classical capacity and zero
quantum capacity, so Eq. (11) exhibits the violations of the additivity of the
private classical capacity and of the quantum capacity at the same time.

All we need now is to find quantum channels that satisfy Eq. (10). One example is
the depolarizing channel of arbitrary dimension $d$, for which both capacities
are known [?, ?], $\mathcal{D}_q(\rho) = (1-q)\rho + q\frac{1}{d}\mathbb{1}$. For
large $d$, the gap becomes asymptotically $\frac{1}{2}H(q, 1-q)$ [22].
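The gap for the depolarizing channel can be evaluated directly. The sketch below is an illustration under an assumption: it uses the standard closed-form expressions for the classical capacity and for the entanglement-assisted capacity of $\mathcal{D}_q$ (with $Q_E$ taken as half the entanglement-assisted classical capacity), which are taken here to be the formulas the cited references provide. The function names are hypothetical.

```python
import math


def _xlog2x(x):
    """x * log2(x), with the usual convention 0 * log 0 = 0."""
    return 0.0 if x <= 0.0 else x * math.log2(x)


def classical_capacity(q, d):
    """C(D_q): log d minus the minimum output entropy of D_q (assumed formula)."""
    lam = 1 - q + q / d                      # largest output eigenvalue
    return math.log2(d) + _xlog2x(lam) + (d - 1) * _xlog2x(q / d)


def ea_quantum_capacity(q, d):
    """Q_E(D_q): half the mutual information for a maximally entangled input."""
    lam = 1 - q + q / d ** 2                 # weight on the maximally entangled state
    c_e = 2 * math.log2(d) + _xlog2x(lam) + (d ** 2 - 1) * _xlog2x(q / d ** 2)
    return c_e / 2


def capacity_gap(q, d):
    """Q_E(D_q) - C(D_q), the quantity that must be positive in Eq. (10)."""
    return ea_quantum_capacity(q, d) - classical_capacity(q, d)


def binary_entropy(q):
    """H(q, 1 - q) in bits."""
    return -_xlog2x(q) - _xlog2x(1 - q)
```

Evaluating `capacity_gap(q, d)` for growing `d` at fixed `q` lets one compare against `0.5 * binary_entropy(q)`, the asymptotic value quoted in the text.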

There also exist large additivity violations: In [23, Theorem V.1] it is proven
that in sufficiently large dimension $d$, there exist $n = (\log d)^4$ orthogonal
bases $B_v = (|b_1^{(v)}\rangle, \ldots, |b_d^{(v)}\rangle)$ such that for all
states $\rho$,

$$ \frac{1}{n}\sum_{v=1}^{n} H(B_v|\rho) \ge \log d - 4, $$

where
$H(B_v|\rho) = -\sum_{i=1}^{d} \langle b_i^{(v)}|\rho|b_i^{(v)}\rangle \log \langle b_i^{(v)}|\rho|b_i^{(v)}\rangle$
is the Shannon entropy of the outcome distribution when measuring the state
$\rho$ in basis $B_v$. What this means is that the channel $\mathcal{N}$ from $d$
to $dn$ dimensions, defined as

$$ \mathcal{N}(\rho) = \sum_{v=1}^{n}\sum_{i=1}^{d} \frac{1}{n}\langle b_i^{(v)}|\rho|b_i^{(v)}\rangle\, |v\rangle\langle v| \otimes |i\rangle\langle i|, $$

satisfies $\chi(\mathcal{N}) \le 4$. Since the channel is entanglement-breaking,
the additivity result of [14] applies, so
$C(\mathcal{N}) = \chi(\mathcal{N}) \le 4$. On the other hand, it is
straightforward to see that $Q_E(\mathcal{N}) = \frac{1}{2}\log d$. Thus we find
that almost the entire bandwidth of $\mathcal{N}$ can be activated by the
presence of entanglement. Now, to construct the example for activation of the
secret capacity by a 50% erasure channel, we observe that
$|E| = dn = d(\log d)^4$. We choose $k = \log|E|$, and get from Theorem 1 that
$C(T_{\mathcal{N}}^{k}) \le O(1)$, while at the same time
$Q(T_{\mathcal{N}}^{k}\otimes\mathcal{A}) \ge \frac{1}{2}\log d$. Note however
that the total input dimension is $2^{O(\log^2 d)}$, which is also the input
dimension of the 50% erasure channel.
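The orders of magnitude in this example can be made concrete with a short calculation. The sketch below is only an illustration under stated assumptions: it takes $n = (\log d)^4$ exactly, rounds $k = \log|E|$ up to an integer, and uses the constant 4 for the bound on $\chi(\mathcal{N})$; the function name and the returned keys are hypothetical.

```python
import math


def large_violation_bounds(log_d):
    """Bounds for the measurement-channel example, given log_d = log2(d)."""
    if log_d <= 1:
        raise ValueError("log_d must exceed 1")
    log_n = 4 * math.log2(log_d)          # n = (log d)^4 bases
    log_env = log_d + log_n               # log|E| with |E| = d * n
    k = math.ceil(log_env)                # k = log|E|, rounded up (assumption)
    slack = (5 + 4 * log_env) / k         # delta(k) from Theorem 1
    return {
        "k": k,
        "classical_upper": 4 + slack,     # C(T) <= chi(N) + delta(k) <= 4 + delta(k)
        "quantum_lower": 0.5 * log_d,     # Q(T (x) A) >= Q_E(N) = (1/2) log d
        "log_dim_control": 2 * k * log_env,  # log2 of dim(A_1) = |E|^(2k)
    }
```

For instance, `large_violation_bounds(100)` describes a channel on 100 qubits: the classical capacity bound stays a small constant while the quantum lower bound grows linearly in `log_d`, at the price of a control register whose size grows quadratically in `log_d`.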

Conclusion. We showed a way of converting any gap between classical capacity and
entanglement-assisted quantum capacity of a channel into a violation of the
additivity of the private capacity of the channel tensored with a 50% erasure
channel. In fact, the quantum capacity of the tensor product channel is larger
than the classical capacity of the single channel.

The construction is based on a certain embedding of the given channel into a
version of the echo-correctable channels from [?]. That the pairing with the
erasure channel gives larger quantum capacity follows from the echo-correctable
reasoning of the benefit of sharing entanglement. On the other hand, the upper
bound on the classical capacity relies on showing that the additional "gadgets"
built around the given channel increase the capacity only by an arbitrarily small
amount. The argument to do so is different from the one proving additivity of
$\chi$ of the channel (which we cannot do for $T_{\mathcal{N}}^{k}$), and also
from the use of the recent continuity bound [24] (which cannot be applied as
$T_{\mathcal{N}}^{k}$ is at finite distance from any channel for which we know
the capacity).

Thus, we even get a new type of example for the non-additivity of the quantum
capacity $Q$, which is different from Smith and Yard's [17] as our channel is not
PPT entanglement binding. Furthermore, while in [17] the lower bound of half the
private capacity on the quantum capacity of the tensor product was enough, here
we observe even a large gap between these two quantities. However, we also note a
conceptual analogy in the constructions: The PPT entanglement binding channel
used in [17] derives from a so-called pbit state [?]. It provides Alice and Bob
with shared randomness - which is made private by distributing the purification
among Alice and Bob, but in a scrambled way that makes it impossible for them to
recover much of the entanglement. Our channel randomizes the environment and
hence gives it to Bob in an encrypted way, limiting the receiver's knowledge
about the noise encountered by the channel. In the construction of [13] as in the
present one, the availability of additional resources allows Alice and Bob to
break the encryption and access the entanglement.
THE CHANNEL CONSTRUCTION
(AUXILIARY MATERIAL)

Mathematically, the channel depicted in Fig. 1 is written

$$ T_{\mathcal{N}}^{k}(\rho^{A_1A_2}) = \int dU\, [U]^{B_0} \otimes \sum_j (W_j^{E\to B_1})(V^{A_2\to B_2E}) \Big(\mathrm{Tr}_{A_1}\big[U^{A_1}\rho\, U^{A_1\dagger}\, |j\rangle\langle j|^{A_{11}}\big]\Big) (V^{A_2\to B_2E})^\dagger (W_j^{E\to B_1})^\dagger, \tag{12} $$

where the notation $[U]$ denotes a classical label realized as mutually
orthogonal states. If there are only countably many values of $U$, these may be
thought of as orthogonal projectors $|U\rangle\langle U|$ on an appropriate
Hilbert space.



PROOF OF THEOREM 1 
(AUXILIARY MATERIAL) 

The left inequality in Eq. (8) is trivial (simply ignore the registers $A_1$,
$B_0$ and $B_1$); also, once the upper bound is proved, the inequality for the
capacity follows by induction. Hence we concentrate on the right inequality in
(8). We shall need a number of auxiliary results.

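Lemma 2 Consider an arbitrary state $\rho$ on $\mathbb{C}^r\otimes\mathbb{C}^s$,
and a fixed basis $\{|j\rangle\}_{j=1,\ldots,r}$ of $\mathbb{C}^r$. Then for a
Haar-distributed random unitary $U \in U(\mathbb{C}^r\otimes\mathbb{C}^s)$, the
random variable $J$ is defined as follows:

$$ \Pr\{J = j|U\} = \langle j|\mathrm{Tr}_s\, U\rho U^\dagger|j\rangle, $$

and it holds that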

$$ \mathbb{E}_U H(J|U) \ge \log r - \log\frac{s+1}{s} \ge \log r - \frac{2}{s}. $$

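Proof. Without loss of generality, the input state is some fixed pure state, so
that after the unitary and before the measurement, we have a uniformly
distributed random state $|\phi\rangle$. Using the bound
$H(J) = -\sum_j P_\phi(j)\log P_\phi(j) \ge -\log\sum_j P_\phi(j)^2$ and the
concavity of $\log x$, we get now by symmetry

$$ \mathbb{E}_U H(J|U) \ge -\log \mathbb{E}_\phi \sum_j \langle j|\mathrm{Tr}_s\,\phi|j\rangle^2
   = -\log\big(r\,\mathbb{E}_\phi \langle 1|\mathrm{Tr}_s\,\phi|1\rangle^2\big)
   = -\log\big(r\,\mathbb{E}_\phi\big[\mathrm{Tr}\,\phi(|1\rangle\langle 1|^r\otimes\mathbb{1}^s)\big]^2\big). \tag{13} $$

For the latter expectation, we use a well-known trick:

$$ \mathbb{E}_\phi\big[\mathrm{Tr}\,\phi(|1\rangle\langle 1|^r\otimes\mathbb{1}^s)\big]^2
   = \mathbb{E}_\phi \mathrm{Tr}\big((\phi\otimes\phi)(|1\rangle\langle 1|^r\otimes\mathbb{1}^s\otimes|1\rangle\langle 1|^r\otimes\mathbb{1}^s)\big)
   = \mathrm{Tr}\Big(\frac{\mathbb{1}+F}{rs(rs+1)}\,(|1\rangle\langle 1|^r\otimes\mathbb{1}^s\otimes|1\rangle\langle 1|^r\otimes\mathbb{1}^s)\Big), $$

where we have introduced a second tensor copy of the total Hilbert space
$\mathbb{C}^r\otimes\mathbb{C}^s$, and $F$ is the SWAP operator of the two. The
last line evaluates easily to

$$ \mathbb{E}_\phi\big[\mathrm{Tr}\,\phi(|1\rangle\langle 1|^r\otimes\mathbb{1}^s)\big]^2
   = \frac{s^2+s}{rs(rs+1)} = \frac{1}{r^2}\,\frac{s+1}{s+1/r} \le \frac{1}{r^2}\,\frac{s+1}{s}. $$

Inserting this into Eq. (13) concludes the proof. □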
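The average used in this proof is easy to test by sampling. The following is a minimal Monte Carlo sketch, assuming Haar-random pure states are generated as normalized complex Gaussian vectors (a standard sampling method, not something specified in the text); all function names are illustrative. It compares the sampled value of $\sum_j P_\phi(j)^2$ with the closed form $(s+1)/(rs+1)$ implied by the SWAP calculation, and the sampled entropy with the bound $\log r - \log\frac{s+1}{s}$.

```python
import math
import random


def haar_random_state(dim, rng):
    """Haar-random pure state on C^dim as a normalized complex Gaussian vector."""
    vec = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in vec))
    return [z / norm for z in vec]


def outcome_distribution(state, r, s):
    """Distribution of J when the C^r factor of a state on C^r (x) C^s is measured."""
    return [sum(abs(state[j * s + m]) ** 2 for m in range(s)) for j in range(r)]


def shannon_entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def estimate(r, s, samples=5000, seed=0):
    """Sampled averages of sum_j P(j)^2 and of H(J) over random states."""
    rng = random.Random(seed)
    collision = entropy = 0.0
    for _ in range(samples):
        p = outcome_distribution(haar_random_state(r * s, rng), r, s)
        collision += sum(x * x for x in p)
        entropy += shannon_entropy(p)
    return collision / samples, entropy / samples


def exact_collision_probability(r, s):
    """r * (s^2 + s) / (r s (r s + 1)) = (s + 1) / (r s + 1)."""
    return (s + 1) / (r * s + 1)


def entropy_lower_bound(r, s):
    """The bound log r - log((s + 1) / s) obtained at the end of the proof."""
    return math.log2(r) - math.log2((s + 1) / s)
```

Running `estimate(r, s)` for a few small pairs and comparing with the two closed-form helpers gives a quick sanity check that the sampled entropy sits above the bound.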



Remark 3 From the above calculation we see that the full unitary invariance of
the Haar measure is not required; we only need to be able to compute the average
purity of the reduced state, which is a quadratic function of the random pure
state $\phi$. Thus, it is sufficient to draw $U$ from a so-called unitary
2-design; in prime power dimension it is known that the Clifford group is an
example of a finite 2-design, see e.g. [?].

Now, in the channel we imagine that, just as $A_{11}$, also the $k-1$ registers
$A_{12}, \ldots, A_{1k}$ are measured in a fixed basis, resulting in $k$ random
variables $J_1, \ldots, J_k$. For simplicity, sometimes we write $J_1 \ldots J_k$
as $J_1^k$, and similarly for the measurement results, $j_1^k$. The variables
$J_2, \ldots, J_k$ are being traced over, so the channel does not change (see
Fig. 1).

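Lemma 4 For any state $\sigma^{A_1A_2B_3}$, suppose we feed $A_1$ into the
channel $T_{\mathcal{N}}^{k}$ but keep $A_2$ unchanged, and let
$\omega^{B_0J_1^kA_2B_3}$ be the state after applying the random unitary $U$ and
then doing measurements on the $A_{1\ell}$, i.e.

$$ \omega^{B_0J_1^kA_2B_3} = \int dU\, [U]^{B_0} \otimes \sum_{j_1\ldots j_k} |j_1\ldots j_k\rangle\langle j_1\ldots j_k| \otimes \langle j_1\ldots j_k|U^{A_1}\sigma^{A_1A_2B_3}U^{A_1\dagger}|j_1\ldots j_k\rangle. $$

Then we have

$$ I(J_1;A_2B_3|B_0) \le \frac{1}{k}I(J_1^k;A_2B_3|B_0) + \frac{1}{k}\sum_{\ell=1}^{k-1} I(J_\ell;J_{\ell+1}^k|B_0), \tag{14} $$

where

$$ I(X:Y) = H(X) + H(Y) - H(XY) = H(\rho_X) + H(\rho_Y) - H(\rho_{XY}) $$

is the (quantum) mutual information between two subsystems of a bipartite state
$\rho_{XY}$ with marginals $\rho_X$ and $\rho_Y$, and the informations
conditional on $B_0$ are averages over the classical states of this register.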



Proof. We use the chain rule to get

$$ I(J_1^k;A_2B_3|B_0) = I(J_1;A_2B_3|B_0) + I(J_2^k;A_2B_3|B_0J_1) $$
$$ \qquad = I(J_1;A_2B_3|B_0) + I(J_2^k;A_2B_3J_1|B_0) - I(J_1;J_2^k|B_0) $$
$$ \qquad \ge I(J_2^k;A_2B_3|B_0) + I(J_1;A_2B_3|B_0) - I(J_1;J_2^k|B_0). $$

Iterating this step for $I(J_2^k;A_2B_3|B_0)$, then $I(J_3^k;A_2B_3|B_0)$, etc.,
we obtain

$$ I(J_1^k;A_2B_3|B_0) \ge \sum_{\ell=1}^{k} I(J_\ell;A_2B_3|B_0) - \sum_{\ell=1}^{k-1} I(J_\ell;J_{\ell+1}^k|B_0). \tag{15} $$

By symmetry all the $I(J_\ell;A_2B_3|B_0)$ are the same and equal to
$I(J_1;A_2B_3|B_0)$, concluding the proof. □
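A single step of this chain-rule argument can be checked numerically in the purely classical special case. The sketch below assumes three classical random variables $X_1$, $X_2$, $Y$ (standing in for $J_1$, $J_2^k$ and $A_2B_3$, with the conditioning on $B_0$ dropped) and verifies $I(X_1X_2;Y) \ge I(X_1;Y) + I(X_2;Y) - I(X_1;X_2)$ on random joint distributions; it does not cover the quantum case treated in the lemma, and all names are illustrative.

```python
import math
import random
from itertools import product


def random_joint(sizes, seed):
    """Random joint distribution over outcomes (x1, x2, y) with given alphabet sizes."""
    rng = random.Random(seed)
    weights = {o: rng.random() for o in product(*(range(n) for n in sizes))}
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}


def entropy(joint, keep):
    """Shannon entropy (bits) of the marginal on the coordinates listed in keep."""
    marginal = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def mutual_information(joint, a, b):
    """I(A;B) = H(A) + H(B) - H(AB) for disjoint coordinate tuples a and b."""
    return entropy(joint, a) + entropy(joint, b) - entropy(joint, a + b)


def chain_rule_step(joint):
    """Return (lhs, rhs) of I(X1 X2;Y) >= I(X1;Y) + I(X2;Y) - I(X1;X2)."""
    lhs = mutual_information(joint, (0, 1), (2,))
    rhs = (mutual_information(joint, (0,), (2,))
           + mutual_information(joint, (1,), (2,))
           - mutual_information(joint, (0,), (1,)))
    return lhs, rhs
```

Iterating this step over the registers and using symmetry is what turns the one-step inequality into Eq. (15).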

Now, to show Eq. (8), we need to do the following. Given another channel
$\mathcal{E}$, whose input and output registers we denote $A_3$ and $B_3$,
respectively, we first have to show that correlated inputs between $A_1$ and
$A_2A_3$ are (almost) of no use, and hence that the control part of our channel
$T_{\mathcal{N}}^{k}$ can be skipped.

Mathematically, we have to look at a given ensemble of input states
$\{\lambda_x, \varphi_x^{A_1A_2A_3}\}$ for $T_{\mathcal{N}}^{k}\otimes\mathcal{E}$.
We shall find a new ensemble of states only on $A_2A_3$ which has almost the same
Holevo information, even when we consider only the output registers $B_2B_3$,
i.e. those of $\mathcal{N}\otimes\mathcal{E}$. It turns out that we have to
distinguish two cases: An individual state $\varphi^{A_1A_2A_3}$ can result in
"small" correlation between $J_1^k$ and $A_2B_3$ - but then Lemma 4 above limits
the correlation between $J_1$ and $A_2B_3$, making input $A_1$ of almost no use
(Proposition 5 below). On the other hand, if there is "large" correlation, we can
use the $J_\ell$ to break up the input state into an ensemble acting only on
$A_2A_3$ with at least the same contribution to the Holevo quantity (Proposition
6). For the following two Propositions, we do the same thing as above, keeping a
record of $J_1^k$, together with the output state on $B_0B_2B_3$. However, notice
that after a unitary $U$ is applied to $A_1$ and then the measurements on the
$A_{1\ell}$ performed, $A_2A_3$ collapses into a state
$\varphi_{j_1^k}(U)^{A_2A_3}$ with probability $P_{j_1^k}(U)$:

$$ P_{j_1^k}(U)\,\varphi_{j_1^k}(U)^{A_2A_3} = \langle j_1\ldots j_k|U^{A_1}\varphi^{A_1A_2A_3}U^{A_1\dagger}|j_1\ldots j_k\rangle. $$

The input state with respect to which entropic quantities are to be interpreted
is indicated by adding that input state as a subscript (unless it is $\varphi$).

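Proposition 5 For the joint channel $T_{\mathcal{N}}^{k}\otimes\mathcal{E}$ with
input state $\varphi^{A_1A_2A_3}$ and
$\sigma^{A_1A_2B_3} = (\mathrm{id}^{A_1A_2}\otimes\mathcal{E})\varphi^{A_1A_2A_3}$,
if $I(J_1^k;A_2B_3|B_0) \le 4\log|E|$, then

$$ H(B_1B_2B_3|B_0) \ge H(B_2B_3) + \log|E| - \delta(k), $$

where $\delta(k) = \frac{1}{k}(5 + 4\log|E|)$.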
Proof. We start by applying Lemma 2 to the entropy of $J_\ell \ldots J_k$; this
yields

$$ H(J_\ell \ldots J_k|B_0) \ge 2(k-\ell+1)\log|E| - \frac{2}{|E|^{2(\ell-1)}}, $$

which implies $I(J_\ell;J_{\ell+1}^k|B_0) \le \frac{2}{|E|^{2(\ell-1)}}$.

Now, with Lemma 4 and using the assumption that
$I(J_1^k;A_2B_3|B_0) \le 4\log|E|$, we find

$$ I(J_1;A_2B_3|B_0) \le \frac{1}{k}I(J_1^k;A_2B_3|B_0) + \frac{1}{k}\sum_{\ell=1}^{k-1} I(J_\ell;J_{\ell+1}^k|B_0)
   \le \frac{4}{k}\log|E| + \frac{1}{k}\sum_{\ell=1}^{\infty}\frac{2}{|E|^{2(\ell-1)}}
   \le \frac{1}{k}(3 + 4\log|E|). \tag{16} $$



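For the given input state $\varphi^{A_1A_2A_3}$, let us write the quantum state
of the system $J_1A_2B_3$ as $\omega^{J_1A_2B_3}(U)$, and the output state on
$B_1B_2B_3$ as $\phi^{B_1B_2B_3}(U)$. We also denote the CPTP quantum operation
mapping $J_1A_2B_3$ to $B_1B_2B_3$ in our setting of
$T_{\mathcal{N}}^{k}\otimes\mathcal{E}$ as $\mathcal{R}$. Then

$$ \mathcal{R}\big(\omega^{J_1A_2B_3}(U)\big) = \phi^{B_1B_2B_3}(U), \qquad
   \mathcal{R}\Big(\frac{\mathbb{1}^{J_1}}{|E|^2}\otimes\omega^{A_2B_3}\Big) = \frac{\mathbb{1}^{B_1}}{|E|}\otimes\phi^{B_2B_3}. $$

By straightforward calculation, and using Eq. (16) and Lemma 2 once more, we have

$$ \mathbb{E}_U D\Big(\omega^{J_1A_2B_3}(U)\,\Big\|\,\frac{\mathbb{1}^{J_1}}{|E|^2}\otimes\omega^{A_2B_3}\Big)
   = 2\log|E| + I(J_1;A_2B_3|B_0) - H(J_1|B_0) $$
$$ \qquad \le 2\log|E| + \frac{1}{k}(3 + 4\log|E|) - \Big(2\log|E| - \frac{2}{|E|^{2(k-1)}}\Big)
   \le \frac{1}{k}(5 + 4\log|E|), $$

where $D(\rho\|\sigma) := \mathrm{Tr}\big(\rho(\log\rho - \log\sigma)\big)$ is
the quantum relative entropy. By the Lindblad-Uhlmann theorem [28], the quantum
relative entropy is monotonic under completely positive quantum operations. As a
result, we obtain

$$ \log|E| - H(B_1B_2B_3|B_0) + H(B_2B_3)
   = \mathbb{E}_U D\Big(\phi^{B_1B_2B_3}(U)\,\Big\|\,\frac{\mathbb{1}^{B_1}}{|E|}\otimes\phi^{B_2B_3}\Big) $$
$$ \qquad \le \mathbb{E}_U D\Big(\omega^{J_1A_2B_3}(U)\,\Big\|\,\frac{\mathbb{1}^{J_1}}{|E|^2}\otimes\omega^{A_2B_3}\Big)
   \le \frac{1}{k}(5 + 4\log|E|), $$

which is exactly to say

$$ H(B_1B_2B_3|B_0) \ge H(B_2B_3) + \log|E| - \frac{1}{k}(5 + 4\log|E|), $$

and we are done. □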

Proposition 6 For the joint channel $T_{\mathcal{N}}^{k}\otimes\mathcal{E}$ with
input state $\varphi^{A_1A_2A_3}$ and
$\sigma^{A_1A_2B_3} = (\mathrm{id}^{A_1A_2}\otimes\mathcal{E})\varphi^{A_1A_2A_3}$,
if $I(J_1^k;A_2B_3|B_0) > 4\log|E|$, then there exists a particular unitary $U_0$
such that

$$ H(B_1B_2B_3|B_0) \ge \sum_{j_1\ldots j_k} P_{j_1^k}(U_0)\, H_{\varphi_{j_1^k}(U_0)}(B_2B_3) + \log|E|. $$

Proof. By assumption, we have

$$ 4\log|E| < I_{\varphi^{A_1A_2A_3}}(J_1^k;A_2B_3|B_0)
   = I_{\varphi^{A_1A_2A_3}}(J_1^k;EB_2B_3|B_0)
   \le I_{\varphi^{A_1A_2A_3}}(J_1^k;B_2B_3|B_0) + 2\log|E|, $$

since the discarding of the register $E$ cannot reduce the mutual information by
more than $2\log|E|$. Thus,

$$ 2\log|E| < I_{\varphi^{A_1A_2A_3}}(J_1^k;B_2B_3|B_0)
   = H_{\varphi^{A_2A_3}}(B_2B_3) - H_{\varphi^{A_1A_2A_3}}(B_2B_3|J_1^kB_0). \tag{17} $$

Hence, there must be a special unitary $U_0$, such that

$$ H_{\varphi^{A_1A_2A_3}}(B_2B_3|J_1^kB_0) \ge H_{\varphi^{A_1A_2A_3}}(B_2B_3|J_1^k, B_0 = U_0)
   = \sum_{j_1^k} P_{j_1^k}(U_0)\, H_{\varphi_{j_1^k}(U_0)^{A_2A_3}}(B_2B_3). \tag{18} $$

By the subadditivity of von Neumann entropy, we have

$$ H_{\varphi^{A_1A_2A_3}}(B_1B_2B_3|B_0) \ge H_{\varphi^{A_2A_3}}(B_2B_3) - H(B_1|B_0)
   \ge H_{\varphi^{A_2A_3}}(B_2B_3) - \log|E|. \tag{19} $$

At last, putting Eqs. (17)-(19) together completes the proof. □
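With that we are now ready for the

Proof of Theorem 1. For an ensemble of input states
$\{\lambda_x, \varphi_x^{A_1A_2A_3}\}$ for $T_{\mathcal{N}}^{k}\otimes\mathcal{E}$
with average state $\rho = \sum_x \lambda_x\varphi_x$, we divide it up into two
classes according to the above cases:

$$ \mathcal{G} := \{x \,|\, I_{\varphi_x}(J_1^k;A_2B_3|B_0) > 4\log|E|\}, $$
$$ \mathcal{C} := \{x \,|\, I_{\varphi_x}(J_1^k;A_2B_3|B_0) \le 4\log|E|\}. $$

Then,

$$ \chi_{\{\lambda_x,\varphi_x\}}(T_{\mathcal{N}}^{k}\otimes\mathcal{E}) = H_\rho(B_1B_2B_3|B_0)
   - \sum_{x\in\mathcal{G}} \lambda_x H_{\varphi_x}(B_1B_2B_3|B_0)
   - \sum_{x\in\mathcal{C}} \lambda_x H_{\varphi_x}(B_1B_2B_3|B_0). \tag{20} $$

By the subadditivity of the von Neumann entropy, the first term is upper bounded

$$ H_\rho(B_1B_2B_3|B_0) \le H_\rho(B_1|B_0) + H_\rho(B_2B_3|B_0)
   \le H_{\rho^{A_2A_3}}(B_2B_3) + \log|E|. \tag{21} $$

Second, by Proposition 5 we have for $x \in \mathcal{C}$ that

$$ H_{\varphi_x}(B_1B_2B_3|B_0) \ge H_{\varphi_x^{A_2A_3}}(B_2B_3) + \log|E| - \delta(k). \tag{22} $$

Third, by Proposition 6 for $x \in \mathcal{G}$, there is an ensemble
decomposition of $\varphi_x^{A_2A_3}$,

$$ \varphi_x^{A_2A_3} = \sum_{y=1}^{|E|^{2k}} \mu_{xy}\,\varphi_{xy}^{A_2A_3}, $$

such that

$$ H_{\varphi_x}(B_1B_2B_3|B_0) \ge \sum_{y=1}^{|E|^{2k}} \mu_{xy}\, H_{\varphi_{xy}^{A_2A_3}}(B_2B_3) + \log|E|. \tag{23} $$

Now define the union ensemble of the above states,

$$ \mathcal{O} = \{\lambda_x, \varphi_x^{A_2A_3}\}_{x\in\mathcal{C}} \cup \{\lambda_x\mu_{xy}, \varphi_{xy}^{A_2A_3}\}_{x\in\mathcal{G},\, y=1,\ldots,|E|^{2k}}. $$

Inserting Eqs. (21)-(23) into Eq. (20) results in

$$ \chi_{\{\lambda_x,\varphi_x\}}(T_{\mathcal{N}}^{k}\otimes\mathcal{E}) \le \chi_{\mathcal{O}}(\mathcal{N}\otimes\mathcal{E}) + \delta(k), $$

and we are done. □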

DEPOLARIZING CHANNEL 
(AUXILIARY MATERIAL) 



The difference of $Q_E(\mathcal{D}_q)$ and $C(\mathcal{D}_q)$ of the depolarizing
channel $\mathcal{D}_q$, for several special values of $d$. The following plot
shows $q$ (horizontal axis) against $Q_E(\mathcal{D}_q) - C(\mathcal{D}_q)$ on
the vertical axis.




